Abstract

We give a short proof of the Cauchy-Binet determinantal formula using multilinear
algebra by first generalizing it to an identity not involving determinants. By extending
the formula to abstract Hilbert spaces we obtain, as a corollary, a generalization of the
classical Parseval identity.
1 Introduction and overview

The classical Cauchy-Binet formula states that if $A$, $B$ are two matrices over $\mathbb{R}$ (or any field)
of sizes $n \times N$, $N \times n$, respectively, with $n \le N$, then

$$\det(AB) = \sum_{\sigma} \det(A_\sigma)\,\det(B_\sigma), \tag{1}$$

where the sum is taken over all $\sigma = (\sigma_1 < \sigma_2 < \cdots < \sigma_n)$, with $\sigma_i \in \{1, \ldots, N\}$, and where
$A_\sigma$ (respectively $B_\sigma$) is the $n \times n$ submatrix of $A$ (respectively of $B$) obtained by
deleting all columns (respectively all rows) except those with indices in $\sigma$.
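
Formula (1) is easy to test numerically. The following is a minimal sketch in plain Python with exact integer arithmetic; the helper names (`det`, `matmul`, `cauchy_binet_rhs`) and the two example matrices are illustrative assumptions and do not come from the original text.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant via the permutation expansion (fine for small matrices)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation, computed from its inversion count.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cauchy_binet_rhs(a, b):
    """Sum over all n-subsets sigma of det(A_sigma) * det(B_sigma)."""
    n, big_n = len(a), len(b)
    total = 0
    for sigma in combinations(range(big_n), n):
        a_sigma = [[row[c] for c in sigma] for row in a]  # keep columns in sigma
        b_sigma = [b[r] for r in sigma]                   # keep rows in sigma
        total += det(a_sigma) * det(b_sigma)
    return total


# Example with n = 2, N = 4 (indices are 0-based in the code).
A = [[1, 2, 0, -1],
     [3, 0, 1, 2]]
B = [[2, 1],
     [0, 1],
     [1, -1],
     [4, 0]]
lhs = det(matmul(A, B))
rhs = cauchy_binet_rhs(A, B)
print(lhs, rhs)
```

Both sides evaluate to the same integer for this choice of $A$ and $B$; the sum on the right runs over the six 2-element subsets of the four column indices.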

There are many proofs of this formula, each telling its own story, explaining the formula 
from a different point of view. The most direct way of proving the formula is by writing down 
the determinant as a sum over permutations and performing algebraic manipulations. This 
is the approach taken in many linear algebra books; see, e.g., Marcus and Minc [8, Theorem
6.1, p. 128] and Gohberg et al. [7, Theorem A.2.1, p. 651]. A probabilistic interpretation and
proof of the formula (which starts by using the formula for a determinant) is also available
[4, 5]. On the other hand, there are many combinatorial proofs. Suffice it, perhaps, to refer to
the one chosen to be included in "Proofs from The Book" [2] by Aigner and Ziegler. This
is a nice proof (after all, it is a proof from The Book) based on the beautiful Gessel-Viennot
lemma which states that, in a finite weighted acyclic directed graph, the determinant of the
path matrix between two sets of vertices of cardinality $n$ each equals a sum over all possible
vertex-disjoint path systems; see [2, Chap. 29, p. 196] and [1] for details. Another very
simple proof appears in the recent book by Terence Tao [13, p. 298] on random matrices.
This proof is based on a relation between the characteristic polynomials of AB and BA. 

On the other hand, it is well-known that the Cauchy-Binet formula is a generalization of
the Pythagorean theorem. Indeed, let $A$ be an $n \times N$ real matrix, $n \le N$, and take $B = A^T$,
the transpose of $A$. Since $B_\sigma = (A^T)_\sigma = (A_\sigma)^T$, the formula gives

$$\det(AA^T) = \sum_{\sigma} \det(A_\sigma)^2,$$

which can be interpreted geometrically as follows: The parallelotope in $\mathbb{R}^N$ generated by
the $n$ row vectors of $A$ has $n$-dimensional Lebesgue measure $\sqrt{\det(AA^T)}$. Therefore the
formula says that the square of the $n$-dimensional measure of an $n$-dimensional parallelotope,
embedded in a higher-dimensional Euclidean space, equals the sum of the squares of the
measures of its projections onto all possible $n$-dimensional coordinate subspaces. If $n = 1$
this reduces to the Pythagorean theorem.
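
As a quick sanity check of this identity, here is a small worked instance with $n = 2$, $N = 3$. The matrix $A$ below is an arbitrary choice made for illustration and does not appear in the original text.

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
AA^T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\det(AA^T) = 3, \\
\det A_{\{1,2\}} &= 1, \qquad \det A_{\{1,3\}} = 1, \qquad \det A_{\{2,3\}} = -1, \\
\sum_{\sigma} \det(A_\sigma)^2 &= 1^2 + 1^2 + (-1)^2 = 3 .
\end{align*}
```

Geometrically, the parallelogram spanned by the rows $(1,0,1)$ and $(0,1,1)$ has area $\sqrt{3}$, and its projections onto the three coordinate planes each have area $1$.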

The goal of this short article is to give a proof of the Cauchy-Binet formula which is 
as simple as possible, from an algebraic-geometric viewpoint. If n = 1, the Cauchy-Binet 
formula is a triviality: it states that the inner product of two $N$-dimensional vectors equals
the sum of the products of their components:

$$(a_1, \ldots, a_N) \cdot (b_1, \ldots, b_N)^T = \sum_{\sigma=1}^{N} a_\sigma b_\sigma.$$

There is no need to take determinants here, because both sides involve $1 \times 1$ matrices, i.e.,
real numbers. What we show is that the general case, when n > 1, is the same, but on 
bigger vector spaces. In Section 2 we give an account of the ingredients we need, and, in 
Section 3, we state and prove the main formula (Theorem 1) without determinants and in a 
more general setup; a corollary of it is the classical Cauchy-Binet formula. Then, in Section 
4, we see that the formula can be extended to a Hilbert space, giving a generalization of the 
classical Parseval identity. We conclude with a few bibliographic remarks. 

2 The main ingredients 

The main theorem, Theorem 1 below, requires two ingredients.

(i) The first is the notion of the determinant of a linear transformation $F : X \to X$ on
a vector space $X$ of dimension $d$. The dimension of the linear space $\Lambda^m X$ of alternating
$m$-linear maps $\omega : X^m \to \mathbb{R}$ is $\binom{d}{m}$. For each $m$, the $m$-th level dual $F^* : \Lambda^m X \to \Lambda^m X$
of $F$ is defined by

$$F^*\omega[x_1, \ldots, x_m] := \omega[Fx_1, \ldots, Fx_m]. \tag{2}$$

See, e.g., [12]. (Duals obey the standard composition rule: $(GF)^* = F^*G^*$.) Since $\Lambda^d X$ is
1-dimensional, the $d$-th level dual $F^*$ is multiplication by a constant. This constant is, by
definition, the determinant of $F$:

$$F^*\omega = (\det F) \cdot \omega, \qquad \omega \in \Lambda^d X. \tag{3}$$



(ii) The second ingredient is very simple too. Let $X$, $Y$, $Z$ be vector spaces, and $F :
X \to Y$, $G : Y \to Z$ linear maps. Suppose $Y$ is the direct sum of $Y_1, \ldots, Y_K$. Let $P_i : Y \to Y_i$,
$1 \le i \le K$, be the projections corresponding to this direct sum (so $\mathrm{id}_Y = P_1 + \cdots + P_K$ is
a partition of the identity on $Y$), and let $E_i : Y_i \to Y$ be the natural embedding of $Y_i$ into
$Y$. Then, clearly,

$$GF = \sum_{i=1}^{K} (GE_i)(P_iF). \tag{4}$$

See Diagram 1.
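
Both ingredients can be checked on small examples. The sketch below is an illustration in plain Python under stated assumptions: it works over the integers, takes $X = \mathbb{R}^3$ for (i) with the concrete top-degree form "determinant of the matrix whose rows are the arguments", and takes $Y = \mathbb{R}^4$ split into coordinate blocks for (ii). All function names and matrices are hypothetical choices, not taken from the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(f, x):
    return [sum(f[i][k] * x[k] for k in range(len(x))) for i in range(len(f))]


# Ingredient (i): a nonzero alternating 3-linear form on R^3.
def omega(*vectors):
    return det([list(v) for v in vectors])


def pullback(f, form):
    """Dual as in (2): (F* form)[x1, ..., xd] = form[F x1, ..., F xd]."""
    return lambda *vectors: form(*(matvec(f, v) for v in vectors))


T = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]          # a linear map on X = R^3
xs = ([1, 0, 2], [0, 1, 1], [3, 1, 0])
lhs = pullback(T, omega)(*xs)                  # (T* omega)[x1, x2, x3]
rhs = det(T) * omega(*xs)                      # (det T) * omega[x1, x2, x3]


# Ingredient (ii): Y = R^4 as a direct sum of three coordinate blocks.
def projection(block, dim):
    """Matrix of P_i : Y -> Y_i, reading off the coordinates listed in `block`."""
    return [[1 if c == r else 0 for c in range(dim)] for r in block]


def embedding(block, dim):
    """Matrix of E_i : Y_i -> Y, placing the block back into Y."""
    return [[1 if r == c else 0 for c in block] for r in range(dim)]


F = [[1, 2], [0, 1], [3, -1], [2, 2]]            # F : X = R^2 -> Y = R^4
G = [[1, 0, 2, 1], [0, 1, 1, -1], [2, 2, 0, 1]]  # G : Y = R^4 -> Z = R^3
blocks = [(0,), (1, 2), (3,)]

total = [[0, 0] for _ in range(3)]
for block in blocks:
    P, E = projection(block, 4), embedding(block, 4)
    term = matmul(matmul(G, E), matmul(P, F))    # (G E_i)(P_i F)
    total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]

print(lhs == rhs, total == matmul(G, F))
```

In the second part, $GE_i$ simply selects the columns of $G$ in block $i$ and $P_iF$ selects the matching rows of $F$, so (4) is the familiar block decomposition of a matrix product.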



3 An abstract version of the Cauchy-Binet formula 

Let $U$, $V$, $W$ be finite-dimensional vector spaces of arbitrary dimensions, and let $B : U \to V$,
$A : V \to W$ be two linear maps. Fix $n \in \mathbb{N}$ and consider the $n$-th level duals $B^* :
\Lambda^n V \to \Lambda^n U$, $A^* : \Lambda^n W \to \Lambda^n V$. Let $N$ be the dimension of $V$ and let $f_1, \ldots, f_N$ be
a basis for $V$. See Diagram 2. Denote by $S_n(N)$ the set of subsets of $\{1, \ldots, N\}$ of size $n$.
For each $\sigma \in S_n(N)$, let $V_\sigma$ be the subspace of $V$ spanned by $\{f_i, i \in \sigma\}$ and consider the
direct sum

$$V = V_\sigma \oplus V_{\bar\sigma}, \tag{5}$$

where $\bar\sigma := \{1, \ldots, N\} \setminus \sigma$, letting

$$P_\sigma : V \to V_\sigma$$

be the projection of $V$ onto $V_\sigma$ along $V_{\bar\sigma}$ and

$$E_\sigma : V_\sigma \to V$$

the natural embedding of $V_\sigma$ into $V$.
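
Theorem 1.

$$(AB)^* = \sum_{\sigma \in S_n(N)} (P_\sigma B)^* (AE_\sigma)^*. \tag{6}$$

Proof. The $\binom{N}{n}$-dimensional space $\Lambda^n V$ is the direct sum of the 1-dimensional spaces $\Lambda^n V_\sigma$,
where $\sigma$ ranges in $S_n(N)$:

$$\Lambda^n V = \bigoplus_{\sigma \in S_n(N)} \Lambda^n V_\sigma. \tag{7}$$

Let $\mathcal{P}_\sigma : \Lambda^n V \to \Lambda^n V_\sigma$ be the projections corresponding to this direct sum, and let $\mathcal{E}_\sigma :
\Lambda^n V_\sigma \to \Lambda^n V$ be the natural embedding. Using (4) (with $A^*$, $B^*$ in place of $F$, $G$, respectively,
and $K = \binom{N}{n}$), we have

$$B^*A^* = \sum_{\sigma \in S_n(N)} (B^*\mathcal{E}_\sigma)(\mathcal{P}_\sigma A^*).$$

See Diagram 2. Since (see Lemma 1 below) $\mathcal{P}_\sigma = E_\sigma^*$ and $\mathcal{E}_\sigma = P_\sigma^*$,
the theorem follows from the composition rules of the duals. □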
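
Lemma 1.

$$\mathcal{P}_\sigma = E_\sigma^*, \qquad \mathcal{E}_\sigma = P_\sigma^*.$$

Proof. We identify $S_n(N)$ with the set of strictly increasing sequences of length $n$ with
values in $\{1, \ldots, N\}$. Thus, if $\sigma$ is a subset of $\{1, \ldots, N\}$ we let $(\sigma_1, \ldots, \sigma_n)$ be a listing of
its elements in increasing order. To prove the first equality it suffices to show that

$$\mathcal{P}_\sigma\omega[v_1, \ldots, v_n] = \omega[E_\sigma v_1, \ldots, E_\sigma v_n],$$

for all $\omega \in \Lambda^n V$ and all $v_1, \ldots, v_n \in V_\sigma$. But then $E_\sigma v_i = v_i$ and, since $V_\sigma$ is spanned by
$f_{\sigma_1}, \ldots, f_{\sigma_n}$, it suffices to show that

$$\mathcal{P}_\sigma\omega[f_{\sigma_{\pi(1)}}, \ldots, f_{\sigma_{\pi(n)}}] = \omega[f_{\sigma_{\pi(1)}}, \ldots, f_{\sigma_{\pi(n)}}],$$

where $\pi$ is a permutation of $\{1, \ldots, n\}$. Since $\omega = \sum_{\tau \in S_n(N)} \mathcal{P}_\tau\omega$ [this is the partition of
the identity on $\Lambda^n V$ corresponding to (7)] we may replace $\omega$ by $\mathcal{P}_\tau\omega$ in the last display:

$$\mathcal{P}_\sigma\mathcal{P}_\tau\omega[f_{\sigma_{\pi(1)}}, \ldots, f_{\sigma_{\pi(n)}}] = \mathcal{P}_\tau\omega[f_{\sigma_{\pi(1)}}, \ldots, f_{\sigma_{\pi(n)}}].$$

But then, if $\tau = \sigma$ the two sides are obviously equal, and if $\tau \ne \sigma$ the left-hand side equals
zero and $\mathcal{P}_\tau\omega[f_{\sigma_{\pi(1)}}, \ldots, f_{\sigma_{\pi(n)}}] = 0$.

To prove the second equality it suffices to show that

$$\mathcal{E}_\sigma\omega[v_1, \ldots, v_n] = \omega[P_\sigma v_1, \ldots, P_\sigma v_n],$$

for all $\omega \in \Lambda^n V_\sigma$ and all $v_1, \ldots, v_n \in V$. But then $\mathcal{E}_\sigma\omega = \omega$. Since $v_i = P_\sigma v_i + P_{\bar\sigma}v_i$
[corresponding to (5)], we have

$$\mathcal{E}_\sigma\omega[v_1, \ldots, v_n] = \omega[P_\sigma v_1 + P_{\bar\sigma}v_1, \ldots, P_\sigma v_n + P_{\bar\sigma}v_n].$$

Using the multilinearity of $\omega$ we split the latter into $2^n$ terms, all of which are zero except
the one involving only $P_\sigma v_i$ as arguments. □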

Consider now the case where $W = U$. Moreover, take the number $n$ in Theorem 1 to
be equal to their common dimension. Assume $n \le N = \dim V$ to avoid trivialities. Then
the linear maps $(AB)^*$, $(P_\sigma B)^*$, and $(AE_\sigma)^*$, appearing in formula (6), are maps between
1-dimensional spaces. Since the spaces $V_\sigma$ and $U$ have common dimension $n$, we can identify
them by means of a linear bijection

$$\varphi_\sigma : V_\sigma \to U.$$

Then

$$(P_\sigma B)^*(AE_\sigma)^* = (P_\sigma B)^*\varphi_\sigma^*(\varphi_\sigma^{-1})^*(AE_\sigma)^* = (\varphi_\sigma P_\sigma B)^*(AE_\sigma\varphi_\sigma^{-1})^*,$$

and so

$$(AB)^* = \sum_{\sigma \in S_n(N)} (\varphi_\sigma P_\sigma B)^*(AE_\sigma\varphi_\sigma^{-1})^*. \tag{8}$$

Since all three linear maps $AB$, $\varphi_\sigma P_\sigma B$, $AE_\sigma\varphi_\sigma^{-1}$ are linear maps on the same $n$-dimensional
vector space $U$, it follows, from the definition of the determinant, that

$$\det(AB) = \sum_{\sigma \in S_n(N)} \det(\varphi_\sigma P_\sigma B)\,\det(AE_\sigma\varphi_\sigma^{-1}). \tag{9}$$

(The role of $\varphi_\sigma$ is to force all maps to be on the same space, so we can talk about determinants.)
In the case where $U = \mathbb{R}^n$, $V = \mathbb{R}^N$, this proves the classical Cauchy-Binet formula (1).
If $N = n$, then we have shown that the determinant of the product is the product of the
determinants.

Therefore (1) follows from (9). The latter is a restatement of (8). But (8) is a special 
case of (6) because in (6) we allow U, V, W to be different with dimensions that may be 
distinct from n. 
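
One way to read (6) in coordinates, offered here as an interpretation rather than as something stated in the text, is that every $n \times n$ minor of $AB$ is a sum over $\sigma$ of a minor of $A$ (columns $\sigma$) times a minor of $B$ (rows $\sigma$), even when $U$, $V$, $W$ have dimensions different from $n$. Equivalently, the matrix of all $n \times n$ minors (often called the compound matrix) is multiplicative. The sketch below checks this for one example; the names `minor` and `compound` and the matrices are assumptions made for illustration.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def minor(m, rows, cols):
    """Determinant of the submatrix of m with the given rows and columns."""
    return det([[m[r][c] for c in cols] for r in rows])


def compound(m, n):
    """Matrix of all n x n minors, indexed by increasing n-subsets of rows/columns."""
    row_sets = list(combinations(range(len(m)), n))
    col_sets = list(combinations(range(len(m[0])), n))
    return [[minor(m, r, c) for c in col_sets] for r in row_sets]


# dim W = 3, dim V = N = 4, dim U = 3, and n = 2 differs from all of them.
A = [[1, 2, 0, 1],
     [0, 1, 1, 2],
     [3, 0, 1, 0]]
B = [[1, 0, 2],
     [2, 1, 0],
     [0, 1, 1],
     [1, 1, 0]]
n = 2

lhs = compound(matmul(A, B), n)                 # all 2 x 2 minors of AB
rhs = matmul(compound(A, n), compound(B, n))    # sums over sigma of products of minors
print(lhs == rhs)
```

Each entry of `rhs` is a sum over the six 2-element subsets $\sigma$ of $\{1, \ldots, 4\}$, which is the index set $S_n(N)$ of Theorem 1 for this example.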

4 Multilinear Parseval's identity 

We are now going to replace the middle space $V$ of the previous setup by a separable
Hilbert space $H$ over the complex numbers $\mathbb{C}$, having inner product $\langle x, y\rangle$. Let $f_1, f_2, \ldots$
be an orthonormal basis for $H$. Let $\Lambda^n H$ be the collection of all continuous alternating
multilinear functionals $\omega : H^n \to \mathbb{C}$. In particular, $\Lambda^1 H = H^*$ is the Hilbert space dual
of $H$. By the Riesz-Fischer theorem, $f_1, f_2, \ldots$ forms a basis for $H^*$ in the sense that
every $\omega \in H^*$ can be uniquely written as $\omega[x] = \sum_{\sigma=1}^{\infty} a_\sigma \langle f_\sigma, x\rangle$, for $a_\sigma \in \mathbb{C}$ such that
$\sum_{\sigma} |a_\sigma|^2 < \infty$. More generally, $\Lambda^n H$ is a separable Hilbert space with orthonormal (with
respect to a suitably defined inner product) basis

$$f_{\sigma_1} \wedge \cdots \wedge f_{\sigma_n}, \qquad \sigma = (\sigma_1, \ldots, \sigma_n) \in S_n(\mathbb{N}),$$

where $S_n(\mathbb{N})$ is the collection of all $n$-tuples $(\sigma_1, \ldots, \sigma_n)$ of positive integers such that $\sigma_1 <
\cdots < \sigma_n$. Recall that the wedge product satisfies, by definition,

$$(f_1 \wedge f_2)[x, y] = f_1[x]f_2[y] - f_1[y]f_2[x],$$

and, more generally, $f_{\sigma_1} \wedge \cdots \wedge f_{\sigma_n}$ is obtained by antisymmetrization of the tensor product of
$f_{\sigma_1}, \ldots, f_{\sigma_n}$. Incidentally, the direct sum $\bigoplus_{n=0}^{\infty} \Lambda^n H$ (where $\Lambda^0 H := \mathbb{C}$) is the so-called
alternating Fock (or fermionic) space [11]. Wedge products can be defined, by linearity,
between any finite number of elements of this space.
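
The antisymmetrization that defines the wedge product can be sketched in a few lines of Python. This is an illustration under assumptions: functionals are modelled as plain callables on real coordinate vectors (the text works over $\mathbb{C}$), `coordinate(k)` is a hypothetical stand-in for a basis functional, and the normalization (no factor $1/n!$) is chosen to agree with the two-factor formula above.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple, via its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def wedge(*functionals):
    """Antisymmetrized tensor product of linear functionals, normalized so that
    (f ^ g)[x, y] = f[x] g[y] - f[y] g[x]."""
    n = len(functionals)

    def form(*vectors):
        total = 0
        for perm in permutations(range(n)):
            term = sign(perm)
            for i in range(n):
                term *= functionals[i](vectors[perm[i]])
            total += term
        return total

    return form


def coordinate(k):
    """Hypothetical basis functional: reads off the k-th coordinate."""
    return lambda x: x[k]


f1, f2, f3 = coordinate(0), coordinate(1), coordinate(2)
x, y, z = [1, 2, 0], [0, 1, 3], [2, 0, 1]

two_form = wedge(f1, f2)
print(two_form(x, y), two_form(y, x), two_form(x, x))  # alternating in its arguments
print(wedge(f1, f2, f3)(x, y, z))
```

With this normalization the value of an $n$-fold wedge on $n$ vectors is the determinant of the matrix of pairings $f_i[v_j]$, which is why swapping two arguments flips the sign and repeating an argument gives zero.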

If $H_1$, $H_2$ are two Hilbert spaces and $F : H_1 \to H_2$ is a continuous linear function then
$F^* : \Lambda^n H_2 \to \Lambda^n H_1$ is defined as before (see (2)) and is, moreover, continuous.

Theorem 2. Let $H$ be a separable Hilbert space over $\mathbb{C}$ with orthonormal basis $f_1, f_2, \ldots$
and let $n$ be a positive integer. For each $\sigma \in S_n(\mathbb{N})$, let $H_\sigma$ be the subspace spanned by
$f_{\sigma_1}, \ldots, f_{\sigma_n}$. Let $E_\sigma : H_\sigma \to H$ be the natural embedding of $H_\sigma$ into $H$ and $P_\sigma : H \to H_\sigma$
the orthogonal projection of $H$ onto $H_\sigma$. If $U$, $W$ are finite-dimensional vector spaces over
$\mathbb{C}$ and $B : U \to H$, $A : H \to W$ continuous linear maps, then

$$(AB)^* = \sum_{\sigma \in S_n(\mathbb{N})} (P_\sigma B)^*(AE_\sigma)^*.$$

If $W = U$ with common dimension $n$, and if $\varphi_\sigma : H_\sigma \to U$ is any linear bijection, then

$$(AB)^* = \sum_{\sigma \in S_n(\mathbb{N})} (\varphi_\sigma P_\sigma B)^*(AE_\sigma\varphi_\sigma^{-1})^*.$$

In particular,

$$\det(AB) = \sum_{\sigma \in S_n(\mathbb{N})} \det(\varphi_\sigma P_\sigma B)\,\det(AE_\sigma\varphi_\sigma^{-1}).$$

The proof of this theorem is exactly as in the finite-dimensional case. Infinite sums have 
to be understood in the Hilbert space sense. 

Consider now $H = L^2[0,1]$ with inner product $\langle x, y\rangle = \int_0^1 x(t)\overline{y(t)}\,dt$ and the standard
orthonormal basis $e_k(t) = \exp(i2\pi kt)$, $k \in \mathbb{Z}$, and let $U = W = \mathbb{C}^n$, for a given positive
integer $n$. A continuous linear map $A : L^2[0,1] \to \mathbb{C}^n$ is necessarily (Riesz representation
theorem) of the form

$$Ax = (\langle x, a_1\rangle, \ldots, \langle x, a_n\rangle) = \left(\int_0^1 \overline{a_1(t)}\,x(t)\,dt, \ldots, \int_0^1 \overline{a_n(t)}\,x(t)\,dt\right), \qquad x \in L^2[0,1],$$

where $a_1, \ldots, a_n \in L^2[0,1]$. A linear map $B : \mathbb{C}^n \to L^2[0,1]$ is of the form

$$(Bu)(t) = u_1b_1(t) + \cdots + u_nb_n(t), \qquad u \in \mathbb{C}^n,$$

where $b_1, \ldots, b_n \in L^2[0,1]$. Hence the $(j,k)$-entry of the matrix of $AB : \mathbb{C}^n \to \mathbb{C}^n$, with respect
to the standard basis on $\mathbb{C}^n$, is given by

$$(AB)_{jk} = \int_0^1 \overline{a_j(t)}\,b_k(t)\,dt.$$

Consider now $\sigma \in S_n(\mathbb{Z})$, i.e., $\sigma = (\sigma_1, \ldots, \sigma_n) \in \mathbb{Z}^n$ with $\sigma_1 < \cdots < \sigma_n$. (There is no
difficulty in replacing $\mathbb{N}$ in the above theorem by $\mathbb{Z}$.) Then $H_\sigma$ is the subspace of $L^2[0,1]$
spanned by $e_{\sigma_1}, \ldots, e_{\sigma_n}$. So the orthogonal projection $P_\sigma : H \to H_\sigma$ is given by

$$P_\sigma x = \hat{x}(\sigma_1)e_{\sigma_1} + \cdots + \hat{x}(\sigma_n)e_{\sigma_n},$$

where

$$\hat{x}(k) = \int_0^1 x(t)\exp(-i2\pi kt)\,dt, \qquad k \in \mathbb{Z},$$

are the Fourier coefficients of $x$. Letting $\varphi_\sigma : H_\sigma \to \mathbb{C}^n$ be the linear bijection that takes
$e_{\sigma_r}$ into the $r$-th standard basis vector of $\mathbb{C}^n$, for $r = 1, \ldots, n$, we see that the $(j,k)$-entry of
the matrix of $\varphi_\sigma P_\sigma B$ is

$$(\varphi_\sigma P_\sigma B)_{jk} = \hat{b}_k(\sigma_j).$$

Arguing analogously, the $(j,k)$-entry of the matrix of $AE_\sigma\varphi_\sigma^{-1}$ is

$$(AE_\sigma\varphi_\sigma^{-1})_{jk} = \overline{\hat{a}_j(\sigma_k)}.$$

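Hence the last formula of Theorem 2 gives

$$\begin{aligned}
\det\left[\int_0^1 \overline{a_j(t)}\,b_k(t)\,dt\right]_{1 \le j,k \le n}
&= \sum_{\sigma \in S_n(\mathbb{Z})} \det\left[\hat{b}_k(\sigma_j)\right]_{1 \le j,k \le n} \det\left[\overline{\hat{a}_j(\sigma_k)}\right]_{1 \le j,k \le n} \\
&= \frac{1}{n!} \sum_{\sigma_1 \in \mathbb{Z}} \cdots \sum_{\sigma_n \in \mathbb{Z}} \det\left[\hat{b}_k(\sigma_j)\right]_{1 \le j,k \le n} \det\left[\overline{\hat{a}_j(\sigma_k)}\right]_{1 \le j,k \le n},
\end{aligned}$$

where the second equality follows from the fact that applying a permutation to $(\sigma_1, \ldots, \sigma_n)$
in both matrices changes the sign of both determinants simultaneously and the fact that
repeated indices result in zero determinants. For $n = 1$, this is the standard Parseval
identity.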
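
The multilinear Parseval identity can be tested numerically when the functions involved are trigonometric polynomials, because then only finitely many $\sigma$ contribute and the integrals can be computed by a uniform Riemann sum, which is exact up to rounding for such integrands. The sketch below takes $n = 2$ and assumes the convention that the inner product is conjugate-linear in its second argument; the coefficient tables `a_hat`, `b_hat` and the frequency set `FREQS` are made-up illustrative data.

```python
import cmath
from itertools import combinations

FREQS = (-1, 0, 1, 2)  # assumed common Fourier support of all four functions

# Fourier coefficients of a_1, a_2 and b_1, b_2 (frequency -> coefficient).
a_hat = [{-1: 1, 0: 2j, 1: 0, 2: 1}, {-1: 0, 0: 1, 1: 1 - 1j, 2: 2}]
b_hat = [{-1: 2, 0: 1, 1: 1j, 2: 0}, {-1: 1, 0: 0, 1: 3, 2: -1}]


def evaluate(coeffs, t):
    """Value at t of the trigonometric polynomial with the given coefficients."""
    return sum(c * cmath.exp(2j * cmath.pi * k * t) for k, c in coeffs.items())


def inner_integral(a, b, samples=16):
    """Integral over [0, 1] of conj(a(t)) * b(t), by a uniform Riemann sum.
    The integrand only has frequencies between -3 and 3, so 16 samples suffice."""
    return sum(evaluate(a, s / samples).conjugate() * evaluate(b, s / samples)
               for s in range(samples)) / samples


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Left-hand side: determinant of the matrix of integrals.
lhs = det2([[inner_integral(a_hat[j], b_hat[k]) for k in range(2)] for j in range(2)])

# Right-hand side: sum over increasing pairs sigma of frequencies. Pairs that
# involve a frequency outside FREQS give a zero determinant and are omitted.
rhs = 0
for sigma in combinations(FREQS, 2):
    b_block = [[b_hat[k][sigma[j]] for k in range(2)] for j in range(2)]
    a_block = [[complex(a_hat[j][sigma[k]]).conjugate() for k in range(2)]
               for j in range(2)]
    rhs += det2(b_block) * det2(a_block)

print(abs(lhs - rhs) < 1e-9)
```

The left-hand side is computed from sampled function values only, so the agreement with the coefficient-side sum is a genuine check of the identity rather than a restatement of it.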

Of course, there is nothing special about the Lebesgue measure. We can obtain formulas
for any other $L^2$ space or other separable Hilbert spaces.



5 Remarks 



This article was motivated by my desire to understand some elements of random
matrix theory [3] and determinantal point processes [6]. In particular, the derivation of the
ubiquitous Tracy-Widom probability distribution [3] involves several applications of
Cauchy-Binet type formulas. When I first looked at it, a standard computational proof was not too
satisfactory. I discovered that there are many proofs, which can be roughly classified into
combinatorial and algebraic ones. The version presented in this short article was inspired by
the simple observation that the Cauchy-Binet formula is a version of the Pythagorean theorem:
it is a version of the Pythagorean theorem on $\Lambda^n \mathbb{R}^N$, with $n \le N$ (which is of course
isomorphic to $\mathbb{R}^{\binom{N}{n}}$).

Several years ago, Zeilberger [14] "complained" that, to most contemporary mathematicians,
matrices and linear transformations are practically interchangeable notions and that
the mainstream 'Bourbakian' establishment, with its profound disdain for the concrete, goes
as far as to frown at the mere mention of the word 'matrix'. He then explains how "to [him],
as well as to other 'dissidents' called 'combinatorialists', a matrix has nothing whatsoever
to do with that intimidating abstract concept called 'a linear transformation between linear
vector spaces' " and, by thinking of matrices as putting weights on a graph, he develops
a combinatorial way of interpreting and proving fundamental results such as the
Cayley-Hamilton theorem. The Cauchy-Binet formula has found a nice proof, in the Zeilberger
sense, as a corollary of the Gessel-Viennot lemma. We also mention Zeng's proof [15] which
also uses Zeilberger's methods.

In a sense then, what we have done here is exactly the opposite of Zeilberger's spirit,
because the proof presented uses nothing else but the concept of a linear map between vector
spaces (and lots of definitions). Each point of view has its own merits in that, for instance,
it leads to different kinds of extensions. (Extensions to infinite matrices are not easy when
the combinatorial point of view is adopted.)

We finally remark that there are generalizations of the Cauchy-Binet formula for the case
where the matrices contain elements of a noncommutative ring [9]. We do not know how to
extend the ideas above to this case.